Ontology driven Semantic Provenance for Heterogeneous Bionomics Experimental Data
Abstract
Scientific experimental data generated by all the bionomic technologies is characterized by heterogeneity in its representation formats, constituents, and generation processes and, therefore, also in its usage. Using the proteomics domain, we demonstrate the important role of provenance information in managing, interpreting, and analyzing experimental data. We present a novel approach that employs an ontology as a knowledge model to automatically create semantic provenance information for high-throughput mass spectrometry (MS) data in the glycoproteomics domain. The Semantic Provenance Annotation of Data in protEomics (SPADE) implementation is based on the ProPreO ontology, a large process ontology (~500 classes, 40 named relationships with 170 class-level restrictions, and 3.1 million instances) that models the complete experimental protocol for MS-based glycoproteomics data analysis. The semantic provenance information created in SPADE enables biologists to query it and retrieve exactly the data they need using expressive “train-of-thought” queries in the SPARQL query language. We also discuss our current work on extending the ProPreO ontology to support toxicological metabolomics experimentation using Nuclear Magnetic Resonance (NMR) spectroscopy. Our strategic goal is for pattern recognition and data mining algorithms to use semantic provenance information for comparative or correlation analysis of Liquid Chromatography MS (LCMS) and NMR spectroscopy experimental data as part of toxicological metabolomics studies.
Similar resources
Ontology-Driven Provenance Management in eScience: An Application in Parasite Research
Provenance, from the French word “provenir”, describes the lineage or history of a data entity. Provenance is critical information in scientific applications to verify experiment process, validate data quality and associate trust values with scientific results. Current industrial scale eScience projects require an end-to-end provenance management infrastructure. This infrastructure needs to be ...
Managing the Deluge of Scientific Data
Provenance information in eScience is metadata that's critical to effectively manage the exponentially increasing volumes of scientific data from industrial-scale experiment protocols. Semantic provenance, based on domain-specific provenance ontologies, lets software applications unambiguously interpret data in the correct context. The semantic provenance framework for eScience data comprises e...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Reflections on Provenance Ontology Encodings
As more data (especially scientific data) is digitized and put on the Web, the importance of tracking and sharing its provenance metadata grows. Besides capturing the annotation properties of data, provenance research also emphasizes interlinking relevant data. Therefore, it is desirable to make provenance metadata easy to access, share, reuse, integrate and reason with. To address these requir...
Scientific Workflow Provenance Metadata Management Using an RDBMS
Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power...